Abstract

We consider a well-defined joint detection and parameter estimation problem. By combining the
Bayesian formulation of the estimation subproblem with suitable constraints on the detection subproblem,
we develop optimum one- and two-step tests for the joint detection/estimation case. The proposed combined
strategies have the very desirable property of allowing a trade-off between detection power
and estimation efficiency. Our theoretical developments are then applied to the problems of retrospective
changepoint detection and MIMO radar. In the former case we are interested in detecting a change in
the statistics of a set of available data and in providing an estimate of the time of change, while in the latter
in detecting a target and estimating its location. Extensive simulations demonstrate that by using the jointly
optimum schemes, we can obtain a significant improvement in estimation quality with only a small sacrifice
in detection power.

Index Terms 

Joint detection-estimation, retrospective change detection, MIMO radar.

I. Introduction 

There are important applications in practice where one is confronted with the problem of distinguishing
between different hypotheses and, depending on the decision, proceeding to estimate a set of relevant
parameters. Characteristic examples are: detection and estimation of objects from images [1]; retrospective
changepoint detection, where one desires to detect a change in statistics but also estimate the
time of the change [2], [3]; defect detection from radiographs, where in addition to detecting the presence
of defects one would also like to find their position and shape [4]; and finally MIMO radar, where we are
interested in detecting the presence of a target and also in estimating several target characteristics such as
position, speed, etc. All these applications clearly call for detection and estimation strategies that address the
two subproblems in a jointly optimum manner.

In the literature, there are basically two (mainly ad hoc) approaches that deal with combined problems.
The first consists in treating the two subproblems separately and applying in each case the corresponding
optimum technique. For instance, one can use the Neyman-Pearson optimum test for detection and the
optimum Bayesian estimator for parameter estimation to solve the combined problem. As we will see
in our analysis, and as is usually the case in combined problems, treating each part separately with the
optimum scheme does not necessarily result in optimum overall performance. The second methodology
consists in using the Generalized Likelihood Ratio Test (GLRT), which detects and estimates at the same
time, with the parameter estimation part relying on the maximum likelihood estimator. Both approaches
lack versatility and are not capable of emphasizing each subproblem according to the needs of the
corresponding application.

Surprisingly, one can find very limited literature that deals with optimum solutions of the joint detection
and estimation problem. A purely Bayesian technique is reported in [5], whereas a combination
of Bayesian and Neyman-Pearson-like methodology is developed in [6]. Specifically, in [6] the error
probabilities under the two hypotheses, used in the classical Neyman-Pearson approach, are replaced by
estimation costs. Mimicking the Neyman-Pearson formulation and constraining the estimation cost under
the nominal hypothesis while optimizing the corresponding cost under the alternative gives rise to a
number of interesting combined tests that can be used in place of GLRT.

Here we will build upon the methodology of [6] but we are going to formulate the combined problem in
a more natural way. In particular we will define a performance measure for the estimation part which we
are going to optimize while ensuring, in parallel, satisfactory performance of the detection part by imposing
suitable constraints on the decision error probabilities. This idea will lead to two novel combined tests
that have no equivalent in [5], [6].

We would like to point out that the theory in [5], [6], as well as the one we are going to develop in our
work, makes sense only when both subproblems constitute desired goals in our setup, that is, when we
are interested in detecting and estimating. These results cannot provide optimum schemes for the case
where one is interested only in detection and is forced to use parameter estimation due to the presence of
nuisance parameters.

Our article is organized as follows: in Section II we define the joint detection and estimation problem
and propose two different optimal solutions. As a quick example, our results are then applied to the
problem of retrospective change detection. In Section III we make a thorough presentation of the MIMO
radar problem under a joint detection and estimation formulation and use the results of the previous
section in order to solve this problem optimally. Specifically, we develop closed-form expressions for all
quantities that are needed to apply our theory and perform simulations to evaluate the performance of the
optimum schemes, addressing also computational issues. Finally, Section IV contains our concluding
remarks.

II. Optimum Joint Detection and Parameter Estimation 

Let us define the problem of interest. Motivated by most applications mentioned in the Introduction, we
limit ourselves to the binary hypothesis case with parameters present only under the alternative hypothesis.
Suppose we are given an observation signal $X$ for which we have the following two hypotheses

$$\begin{aligned} H_0 &:\ X \sim f_0(X) \\ H_1 &:\ X \sim f_1(X|\theta),\quad \theta \sim \pi(\theta), \end{aligned}$$

where $f_0(X)$, $f_1(X|\theta)$, $\pi(\theta)$ are known pdfs. Specifically, we assume that under $H_0$ we know the pdf
of $X$ completely, whereas under $H_1$ the pdf of $X$ contains a collection of random parameters $\theta$ for
which we have available some prior pdf $\pi(\theta)$. The goal is to develop a mechanism that distinguishes
between $H_0$, $H_1$ and, furthermore, every time it decides in favor of $H_1$ it provides an estimate $\hat{\theta}$ for $\theta$.
Our combined detection/estimation scheme is therefore comprised of a randomized test $\{\delta_0(X), \delta_1(X)\}$
with $\delta_i(X)$ denoting the randomization probability for deciding in favor of $H_i$; and a function $\hat{\theta}(X)$ that
provides the necessary parameter estimates. Clearly $\delta_i(X) \geq 0$ and $\delta_0(X) + \delta_1(X) = 1$.

Let us recall, very briefly, the optimum detection and estimation theory when the two subproblems are 
considered separately. 

Neyman-Pearson hypothesis testing: Fix a level $\alpha \in (0, 1)$; if $D$ denotes our decision then we are
interested in selecting a test (namely the randomization probabilities $\delta_i(X)$) so that the detection
probability $P_1(D = H_1)$ is maximized subject to the false alarm constraint $P_0(D = H_1) \leq \alpha$. Equivalently, the
previous maximization can be replaced by the minimization of the probability of miss $P_1(D = H_0)$. The
optimum detection scheme is the celebrated likelihood ratio test, which takes the following form
for our specific setup

$$\mathcal{L}(X) = \frac{f_1(X)}{f_0(X)} = \frac{\int f_1(X|\theta)\pi(\theta)\,d\theta}{f_0(X)} \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{NP}. \tag{1}$$

In other words we decide $H_1$ whenever the likelihood ratio $\mathcal{L}(X)$ exceeds the threshold $\gamma_{NP}$; $H_0$ whenever
it falls below and randomize with a probability $p$ when the likelihood ratio is equal to the threshold. The
threshold $\gamma_{NP}$ and the probability $p$ are selected to satisfy the false alarm constraint with equality. The
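randomization probabilities $\delta_0^{NP}(X)$, $\delta_1^{NP}(X)$ corresponding to the Neyman-Pearson test are given by

$$\delta_0^{NP}(X) = 1_{\{\mathcal{L}(X) < \gamma_{NP}\}} + (1-p)\,1_{\{\mathcal{L}(X) = \gamma_{NP}\}}, \qquad \delta_1^{NP}(X) = 1_{\{\mathcal{L}(X) > \gamma_{NP}\}} + p\,1_{\{\mathcal{L}(X) = \gamma_{NP}\}}, \tag{2}$$

where $1_A$ denotes the index function of the set $A$.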
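To illustrate how the threshold and the randomization probability can be selected so that the false alarm constraint holds with equality, the following is a minimal sketch for a finite sample space. The function names and the representation of the two pdfs as lists are assumptions made for this example, and `p` is taken to be the probability of deciding in favor of H1 when the likelihood ratio equals the threshold.

```python
from fractions import Fraction


def neyman_pearson(f0, f1, alpha):
    """Return (gamma, p) so that the randomized likelihood ratio test has
    false alarm probability exactly alpha on a finite sample space.

    f0[x], f1[x]: probability of outcome x under H0 and H1 (f1 is the
    marginal pdf under H1, with the parameters already integrated out).
    """
    support = [x for x in range(len(f0)) if f0[x] > 0]
    ratios = sorted({f1[x] / f0[x] for x in support}, reverse=True)
    mass_above = 0  # P0(L(X) > gamma) for the current candidate gamma
    for gamma in ratios:
        mass_at = sum(f0[x] for x in support if f1[x] / f0[x] == gamma)
        if mass_above + mass_at >= alpha:
            # Randomize on {L(X) = gamma} to spend the remaining budget.
            return gamma, (alpha - mass_above) / mass_at
        mass_above += mass_at
    raise ValueError("alpha must lie in (0, 1)")


def delta1_np(x, f0, f1, gamma, p):
    """Probability of deciding H1 at outcome x."""
    if f0[x] == 0:
        return 1  # outcome impossible under H0: no false alarm is possible
    ratio = f1[x] / f0[x]
    return 1 if ratio > gamma else (p if ratio == gamma else 0)


# Hypothetical three-outcome example with exact rational arithmetic.
f0 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
f1 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
gamma, p = neyman_pearson(f0, f1, Fraction(3, 8))
false_alarm = sum(f0[x] * delta1_np(x, f0, f1, gamma, p) for x in range(3))
```

Exact fractions are used only so that the equality of the false alarm probability with the prescribed level can be checked without rounding effects.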

Bayesian parameter estimation: Suppose that we know with certainty that the observations $X$ come
from hypothesis $H_1$; then we are interested in providing an estimate $\hat{\theta}(X)$ for the parameters $\theta$. We
measure the quality of our estimate with the help of a cost function $C(\hat{\theta}, \theta) \geq 0$. We would like to select
the optimum estimator in order to minimize the average cost $E_1[C(\hat{\theta}(X),\theta)]$, where expectation is with
respect to $X$ and $\theta$.

From [7, Page 142] we have that the optimum Bayesian estimator is the following minimizer (provided
it exists)

$$\hat{\theta}_0(X) = \arg\inf_U \mathcal{C}(U|X), \tag{3}$$

where $\mathcal{C}(U|X)$ is the posterior cost function

$$\mathcal{C}(U|X) = E_1[C(U,\theta)|X] = \frac{\int C(U,\theta) f_1(X|\theta)\pi(\theta)\,d\theta}{\int f_1(X|\theta)\pi(\theta)\,d\theta} = \frac{\int C(U,\theta) f_1(X|\theta)\pi(\theta)\,d\theta}{f_1(X)}, \tag{4}$$

and expectation, as we can see from the last equality, is with respect to $\theta$ for given $X$. Finally we denote
the optimum posterior cost as $\mathcal{C}_0(X)$, that is,

$$\mathcal{C}_0(X) = \inf_U \mathcal{C}(U|X) = \mathcal{C}(\hat{\theta}_0(X)|X). \tag{5}$$

This quantity will play a very important role in the development of our theory as it constitutes a genuine
quality index for the estimate $\hat{\theta}_0(X)$.
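The estimator and its quality index can be sketched numerically when the parameter takes finitely many values, so that the integrals in the posterior cost become sums. The function names, the candidate grid and the numerical values below are assumptions chosen for illustration only.

```python
def posterior_cost(u, thetas, prior, lik, cost):
    """Posterior cost of the candidate estimate u for a discrete parameter.

    thetas[k]: k-th parameter value, prior[k]: its prior probability,
    lik[k]: f1(X | thetas[k]) evaluated at the observed X.
    """
    num = sum(cost(u, t) * l * w for t, w, l in zip(thetas, prior, lik))
    den = sum(l * w for w, l in zip(prior, lik))
    return num / den


def bayes_estimate(candidates, thetas, prior, lik, cost):
    """Return (estimate, optimum posterior cost) over a finite candidate set."""
    best = min(candidates,
               key=lambda u: posterior_cost(u, thetas, prior, lik, cost))
    return best, posterior_cost(best, thetas, prior, lik, cost)


thetas = [0, 1, 2]
prior = [1 / 3, 1 / 3, 1 / 3]
lik = [0.1, 0.6, 0.3]  # hypothetical likelihood values at the observed X

# Unit cost for a wrong value: the estimate is the posterior mode.
theta_map, c0_map = bayes_estimate(thetas, thetas, prior, lik,
                                   lambda u, t: float(u != t))

# Squared error: the minimizer over a fine grid is the posterior mean.
grid = [k / 100 for k in range(201)]
theta_mse, c0_mse = bayes_estimate(grid, thetas, prior, lik,
                                   lambda u, t: (u - t) ** 2)
```

In this example the posterior probabilities are 0.1, 0.6 and 0.3, so the unit cost leads to the value 1 with posterior cost 0.4, while the squared error leads to the posterior mean 1.2 with posterior cost equal to the posterior variance 0.36.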

Let us now consider the combined problem. We recall that the hypothesis testing part distinguishes
between $H_0$ and $H_1$. As we have seen, the Neyman-Pearson approach provides the best possible detection
structure for controlling and optimizing the corresponding decision error probabilities. However, with a
decision mechanism that focuses on the decision errors, we cannot necessarily guarantee efficiency for
the estimation part. Consequently, the detection part cannot be treated independently
from the estimation part. Following this rationale, we propose two possible approaches involving single-
and two-step schemes that differ in the number of decision mechanisms they incorporate and the way
they combine the notion of reliable estimate with the detection subproblem.

A. Single-Step Tests 

Let us begin our analysis by introducing a proper performance measure for the estimation subproblem.
Following the Bayesian approach we assume the existence of the cost function $C(\hat{\theta},\theta) \geq 0$. Computing
the average cost that will play the role of our performance measure is not as straightforward as in the
pure estimation problem and requires some consideration. Note that an estimate $\hat{\theta}(X)$ is provided only
when we decide in favor of $H_1$. On the other hand, averaging of $C(\hat{\theta},\theta)$ makes sense only under the
alternative hypothesis $H_1$ since under the nominal $H_0$ there is no true parameter $\theta$. Consequently we
propose the following performance criterion

$$\mathcal{J}(\delta_0,\delta_1,\hat{\theta}) = E_1[C(\hat{\theta}(X),\theta)\,|\,D = H_1] = \frac{E_1[C(\hat{\theta}(X),\theta)\,1_{\{D = H_1\}}]}{P_1(D = H_1)}, \tag{6}$$

where expectation is with respect to $X$ and $\theta$. We realize that with our criterion, the estimation
performance depends not only on the estimator but also on the detection mechanism. As we can see, we
compute the average cost over the event $\{D = H_1\}$, which is the only case an estimate is available.

One would immediately argue that the measure in (6) does not consider in any sense the decision errors,
that is, the quality of the detector. However, these errors can be efficiently controlled through suitable
constraints. Specifically we can impose the familiar false alarm constraint $P_0(D = H_1) \leq \alpha$ but also a
constraint on the probability of miss $P_1(D = H_0) \leq \beta$ where $\alpha, \beta \in (0, 1)$. With these two constraints
we have complete control over the decision mechanism and therefore, now, it makes sense to attempt to
minimize the conditional average estimation cost $\mathcal{J}(\delta_0, \delta_1, \hat{\theta})$ over the decision rule $\{\delta_0(X), \delta_1(X)\}$ and
the estimator $\hat{\theta}(X)$. Note that the two constraints guarantee satisfactory performance for the detection
part and, by minimizing the criterion, we can enjoy optimum performance in the estimation part.

Let us carry out the desired optimization gradually. We first fix the decision rule $\{\delta_0(X), \delta_1(X)\}$ and
optimize $\mathcal{J}(\delta_0, \delta_1, \hat{\theta})$ with respect to the estimator $\hat{\theta}(X)$. We have the following lemma that provides the
solution to this problem.

Lemma 1: Let $\varphi(X) \geq 0$ be a scalar function, then the following functional of $\hat{\theta}(X)$

$$\mathcal{D}(\hat{\theta}) = \frac{\iint \varphi(X)\, C(\hat{\theta}(X),\theta)\, f_1(X|\theta)\pi(\theta)\,d\theta\,dX}{\iint \varphi(X)\, f_1(X|\theta)\pi(\theta)\,d\theta\,dX} \tag{7}$$

is minimized when $\hat{\theta}(X)$ is the optimum Bayesian estimator $\hat{\theta}_0(X)$ defined in (3) and (4).
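Proof: The proof is simple. We can write

$$\begin{aligned}
\mathcal{D}(\hat{\theta}) &= \frac{\iint \varphi(X)\, C(\hat{\theta}(X),\theta)\, f_1(X|\theta)\pi(\theta)\,d\theta\,dX}{\iint \varphi(X)\, f_1(X|\theta)\pi(\theta)\,d\theta\,dX}
= \frac{\int \varphi(X) \left(\int C(\hat{\theta}(X),\theta)\, f_1(X|\theta)\pi(\theta)\,d\theta\right) dX}{\int \varphi(X) \left(\int f_1(X|\theta)\pi(\theta)\,d\theta\right) dX} \\
&= \frac{\int \varphi(X)\, \mathcal{C}(\hat{\theta}(X)|X)\, f_1(X)\,dX}{\int \varphi(X)\, f_1(X)\,dX}
\geq \frac{\int \varphi(X)\, \inf_U \mathcal{C}(U|X)\, f_1(X)\,dX}{\int \varphi(X)\, f_1(X)\,dX} \\
&= \frac{\int \varphi(X)\, \mathcal{C}(\hat{\theta}_0(X)|X)\, f_1(X)\,dX}{\int \varphi(X)\, f_1(X)\,dX}
= \frac{\int \varphi(X)\, \mathcal{C}_0(X)\, f_1(X)\,dX}{\int \varphi(X)\, f_1(X)\,dX},
\end{aligned} \tag{8}$$

where for the last two equalities we used (5). ■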
Lemma 1 is a very interesting result because it demonstrates an extended optimality property for the
classical Bayesian estimator. In particular, by selecting $\varphi(X) = \delta_1(X)$ we conclude that $\hat{\theta}_0(X)$ continues
to be optimum even if estimation is dictated by a decision mechanism and not performed over all data
$X$, as is the usual practice with Bayesian estimation. Consequently, we can now fix our estimator to the
Bayesian estimator $\hat{\theta}_0(X)$ with corresponding optimized performance measure equal to

$$\mathcal{J}(\delta_0,\delta_1) = \mathcal{J}(\delta_0,\delta_1,\hat{\theta}_0) = \frac{\int \delta_1(X)\, \mathcal{C}_0(X)\, f_1(X)\,dX}{\int \delta_1(X)\, f_1(X)\,dX}. \tag{9}$$
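The dependence of this measure on the decision rule can be made concrete on a finite sample space, where the integrals reduce to sums. The following sketch uses hypothetical values for the marginal pdf under H1 and for the optimum posterior cost; the function names are assumptions of this example.

```python
def detection_probability(delta1, f1):
    """P1(D = H1) for a decision rule given by its H1 probabilities."""
    return sum(d * f for d, f in zip(delta1, f1))


def conditional_cost(delta1, c0, f1):
    """Average estimation cost conditioned on deciding H1, with the
    Bayesian estimator used for estimation (discrete version of the ratio)."""
    num = sum(d * c * f for d, c, f in zip(delta1, c0, f1))
    return num / detection_probability(delta1, f1)


f1 = [0.2, 0.3, 0.5]   # marginal pdf of three outcomes under H1
c0 = [0.9, 0.5, 0.1]   # optimum posterior cost of each outcome

always_h1 = [1, 1, 1]  # decide H1 for every outcome
selective = [0, 1, 1]  # refuse H1 where the estimate is poorest

cost_all = conditional_cost(always_h1, c0, f1)
cost_sel = conditional_cost(selective, c0, f1)
pd_sel = detection_probability(selective, f1)
```

Here the conditional cost drops from 0.38 to 0.25 while the detection probability drops from 1 to 0.8, which is the kind of trade-off between detection power and estimation efficiency the formulation is designed to control.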

It is clear that our intention is to further minimize $\mathcal{J}(\delta_0,\delta_1)$ over the class of detectors that satisfy the
two error constraints. Before addressing this problem, however, we need to make some remarks.

Remark 1: One can argue that constraining the false alarm probability to $\alpha$, using the
Neyman-Pearson optimum test for detection and then the Bayesian estimator for estimation (in other
words, treating the two subproblems separately) has definite optimality properties, since this combination
optimizes both the detection and the estimation part. This is indeed true; however, with such a scheme the
main emphasis is on the detection part. For estimation, after optimizing the corresponding performance
(by using $\hat{\theta}_0(X)$), we have no further control. In fact, if the resulting estimation performance is not
satisfactory, there is no room for further improvement. This weakness is clearly circumvented by the
proposed formulation which offers, as we discuss next, the additional flexibility to trade detection power
for estimation efficiency, according to the needs of the designer.

Remark 2: We recall that in our setup we have the two constraints $P_0(D = H_1) \leq \alpha$ and $P_1(D = H_0) \leq \beta$.
By fixing the false alarm probability to $\alpha$, the probability of miss is minimized by the Neyman-Pearson
test; call this minimum value $\beta(\alpha)$. Since no test with false alarm probability not exceeding $\alpha$ can have
a probability of miss that goes below $\beta(\alpha)$, this suggests that in our constraint on the probability of miss,
$\beta$ must be selected to satisfy $\beta \geq \beta(\alpha)$. We are thus reducing, in a controlled manner, the detection
power as compared to the Neyman-Pearson test (since we allow more misses), aiming at improving the
effectiveness of our estimation. We have the following theorem that provides the optimum scheme.

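Theorem 1: Consider the two constraints $P_0(D = H_1) \leq \alpha$ and $P_1(D = H_0) \leq \beta$, where $0 < \alpha < 1$
and $\beta(\alpha) \leq \beta < 1$ with $\beta(\alpha)$ denoting the probability of miss of the Neyman-Pearson test. Let $\lambda_0 \geq 0$
be the solution of the equation¹

$$P_1(\lambda_0 \geq \mathcal{C}_0(X)) = 1 - \beta, \tag{10}$$

where $\mathcal{C}_0(X)$ is defined in (5). Then the optimum combined scheme is comprised of the Bayesian
estimator $\hat{\theta}_0(X)$ defined in (3), (4), for the estimation part while the decision rule that optimizes the
average conditional cost $\mathcal{J}(\delta_0, \delta_1)$ in (9) under the two error constraints is given by

$$\mathcal{C}_0(X) \underset{H_0}{\overset{H_1}{\lessgtr}} \lambda_0, \quad \text{if } \alpha \geq P_0(\lambda_0 \geq \mathcal{C}_0(X)), \tag{11}$$

$$\frac{f_1(X)}{f_0(X)}\,[\lambda - \mathcal{C}_0(X)] \underset{H_0}{\overset{H_1}{\gtrless}} \gamma, \quad \text{if } \alpha < P_0(\lambda_0 \geq \mathcal{C}_0(X)), \tag{12}$$

where in (12) $\lambda$, $\gamma$ are selected so that the two error probability constraints are satisfied with equality.
Proof: The proof is presented in the Appendix. ■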

From (11) and (12) we deduce that the optimum detector takes into account the estimation part through
$\mathcal{C}_0(X)$, which constitutes a quality index for the estimate $\hat{\theta}_0(X)$. If this index is sufficiently large then,
in both cases, the test decides in favor of $H_0$. In particular, in (12), this decision may occur even if the
classical likelihood ratio exceeds the threshold $\gamma_{NP}$, suggesting decision in favor of $H_1$.

Summarizing, our first optimum combined test consists in applying (11) or (12) to decide between the
two hypotheses, and every time we make a decision in favor of $H_1$ we use $\hat{\theta}_0(X)$ defined in (3) to provide
the optimum parameter estimate.
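The single-step scheme can be sketched as follows. The selection between the two rules is approximated here with simulated values of the quality index under each hypothesis, which is an assumption of this example and not a procedure taken from the text; the function names and the rule labels are hypothetical, and for the second rule the thresholds are assumed to have been calibrated beforehand.

```python
import math


def empirical_quantile(samples, q):
    """Smallest sample v such that at least a fraction q of samples is <= v."""
    ordered = sorted(samples)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


def select_rule(c0_under_h0, c0_under_h1, alpha, beta):
    """Monte Carlo approximation of the threshold on the quality index and
    of the condition deciding which of the two rules applies."""
    lam0 = empirical_quantile(c0_under_h1, 1 - beta)
    false_alarm = sum(c <= lam0 for c in c0_under_h0) / len(c0_under_h0)
    return ("rule_11" if alpha >= false_alarm else "rule_12"), lam0


def single_step(lik_ratio, c0, theta0, rule, lam, gamma=None):
    """Return (decision, estimate); the estimate is given only under H1."""
    if rule == "rule_11":
        accept = c0 <= lam
    else:
        accept = lik_ratio * (lam - c0) >= gamma
    return ("H1", theta0) if accept else ("H0", None)


# Hypothetical simulated quality indices under the two hypotheses.
rule_a, lam_a = select_rule([0.15, 0.5, 0.6, 0.7], [0.1, 0.2, 0.3, 0.4], 0.3, 0.5)
rule_b, lam_b = select_rule([0.15, 0.5, 0.6, 0.7], [0.1, 0.2, 0.3, 0.4], 0.1, 0.5)
```

Note that with the second rule a large likelihood ratio does not guarantee a decision in favor of H1: for instance `single_step(5.0, 0.45, 1.7, "rule_12", 0.5, gamma=1.0)` returns `("H0", None)` because the quality index is too close to the threshold.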

B. Two-Step Tests 

In the previous setup our decision was between $H_0$ and $H_1$ and we were sacrificing detection power
to improve estimation. However, in most applications, giving up part of the detection capacity may be
regarded as undesirable. For example, in MIMO radar it is still helpful to detect a target even if we cannot
reliably estimate its parameters.

¹ For simplicity we assume that $\mathcal{C}_0(X)$ and $f_1(X)/f_0(X)$, when considered as random variables, have no atoms under both
hypotheses (the corresponding pdfs have no delta functions). This avoids the need for randomization every time a test statistic
hits a threshold.

It is possible to preserve the detection power and at the same time improve the estimation
performance if we follow a slightly different approach that involves two-step mechanisms. Specifically, we
propose the use of an initial detection strategy that distinguishes between $H_0$ and $H_1$; whenever we decide
in favor of $H_1$ then, at a second step, we compute the estimate $\hat{\theta}(X)$ and employ a second test that decides
whether the estimate is reliable or unreliable, denoted as $H_{1r}$ and $H_{1u}$ respectively. Consequently we
propose to make three different decisions $H_0$, $H_{1r}$ and $H_{1u}$ with the union of the last two corresponding
to hypothesis $H_1$. As we can see, we "trust" the estimate $\hat{\theta}(X)$ only when we decide in favor of $H_{1r}$,
but we have detection even if we discard the estimate as unreliable, that is, we decide $H_{1u}$.

For the first test we use our familiar randomization probabilities $\{\delta_0(X), \delta_1(X)\}$ while for the second
we employ a new pair $\{q_{1r}(X), q_{1u}(X)\}$. The latter functions are the randomization probabilities needed
to decide between reliable/unreliable estimation given that the first test decided in favor of $H_1$. Therefore
we have $q_{1r}(X), q_{1u}(X) \geq 0$ and $q_{1r}(X) + q_{1u}(X) = 1$. For every combination of the four randomization
probabilities we define, similarly to the previous subsection, the corresponding average conditional cost
for the estimator $\hat{\theta}(X)$, namely

$$\mathcal{J}(\delta_0,\delta_1,q_{1r},q_{1u},\hat{\theta}) = E_1[C(\hat{\theta}(X),\theta)\,|\,D = H_{1r}] = \frac{\int \delta_1(X)\, q_{1r}(X)\, \mathcal{C}(\hat{\theta}(X)|X)\, f_1(X)\,dX}{\int \delta_1(X)\, q_{1r}(X)\, f_1(X)\,dX}. \tag{13}$$

As we can see, we now condition on the event $\{D = H_{1r}\}$ since this is the only case when the estimate
$\hat{\theta}(X)$ is accepted. We also note that, for given $X$, the probability to decide in favor of $H_{1r}$ is $\delta_1(X) q_{1r}(X)$
because we must decide in favor of $H_1$ in the first step (with probability $\delta_1(X)$) and for $H_{1r}$ in the second
(with probability $q_{1r}(X)$).

In the first step we would like to adopt the best possible detector to select between $H_0$ and $H_1$.
We follow the classical Neyman-Pearson approach and impose the false alarm probability constraint
$P_0(D = H_1) \leq \alpha$ while we minimize the probability of miss $P_1(D = H_0)$. This leads to the Neyman-Pearson
test defined in (1) with corresponding randomization probabilities $\delta_0^{NP}(X)$, $\delta_1^{NP}(X)$ given in (2).

Having identified the first, let us proceed to the second step of our detection/estimation mechanism that
involves parameter estimation and a second test that labels the estimate as reliable/unreliable. Consider the
average conditional cost $\mathcal{J}(\delta_0^{NP},\delta_1^{NP},q_{1r},q_{1u},\hat{\theta})$ and assume $q_{1r}(X)$, $q_{1u}(X)$ fixed; then from Lemma 1 and
by selecting $\varphi(X) = \delta_1^{NP}(X)\, q_{1r}(X)$, we conclude that this criterion is minimized when $\hat{\theta}(X) = \hat{\theta}_0(X)$,
that is, again with the optimum Bayes estimator defined in (3) and (4). Call

$$\mathcal{J}(q_{1r},q_{1u}) = \mathcal{J}(\delta_0^{NP},\delta_1^{NP},q_{1r},q_{1u},\hat{\theta}_0) = \frac{\int \delta_1^{NP}(X)\, q_{1r}(X)\, \mathcal{C}_0(X)\, f_1(X)\,dX}{\int \delta_1^{NP}(X)\, q_{1r}(X)\, f_1(X)\,dX}, \tag{14}$$

the corresponding performance. It is then clear that we would like to minimize even further this criterion by
selecting properly our second decision mechanism which is expressed with the help of the randomization
probabilities $\{q_{1r}(X), q_{1u}(X)\}$. Note however that, in addition to this minimization, we are also interested
in generating as many "reliable estimates" as possible when applying the second test. These two goals
are clearly conflicting, therefore we adopt a Neyman-Pearson-like approach in order to come up with an
optimum scheme. In other words we constrain one quantity and optimize the other.

To find a suitable constraint, note that because $q_{1r}(X) \leq 1$, the probability $P_1(D = H_{1r})$ of deciding in favor
of $H_{1r}$ (reliable estimate) satisfies

$$P_1(D = H_{1r}) = \int \delta_1^{NP}(X)\, q_{1r}(X)\, f_1(X)\,dX \leq \int \delta_1^{NP}(X)\, f_1(X)\,dX = P_1(D = H_1) = 1 - \beta(\alpha). \tag{15}$$

In other words this probability is upper bounded by the detection probability $1 - \beta(\alpha)$ of the Neyman-Pearson
test where, we recall, $\beta(\alpha)$ denotes the corresponding probability of miss. This inequality reveals
the obvious fact that only a portion of our initial decisions in favor of $H_1$ provide reliable estimates in
the second step. Actually it is this part we intend to control by imposing the following inequality

$$1 - \beta \leq P_1(D = H_{1r}) = \int \delta_1^{NP}(X)\, q_{1r}(X)\, f_1(X)\,dX \tag{16}$$

with $1 > \beta \geq \beta(\alpha)$. The constraint in (16) expresses our desire that at least a fraction
$0 < \frac{1-\beta}{1-\beta(\alpha)} \leq 1$ of the initial decisions in favor of $H_1$ must provide reliable estimates. Subject to this
constraint the goal is to obtain the best possible estimation performance, that is, minimize the performance
measure $\mathcal{J}(q_{1r},q_{1u})$. The solution to this optimization problem is given in the next lemma.

Lemma 2: Let $1 > \beta \geq \beta(\alpha)$, then the test that minimizes the average conditional cost $\mathcal{J}(q_{1r},q_{1u})$
defined in (14) subject to the constraint in (16), is given by

$$\mathcal{C}_0(X) \underset{H_{1u}}{\overset{H_{1r}}{\lessgtr}} \lambda, \tag{17}$$

where $\lambda$ is selected to satisfy (16) with equality and $\mathcal{C}_0(X)$ is defined in (5).

Proof: The proof follows a methodology which is very similar to the one used in the proof of
Theorem 1. Since it presents no particular difficulties, it is omitted. ■

As in the previous subsection, $\mathcal{C}_0(X)$ constitutes a quality index for the estimate $\hat{\theta}_0(X)$. With Lemma 2
we end up with the very plausible decision rule of accepting $\hat{\theta}_0(X)$ as reliable whenever this index is
below some threshold $\lambda$ while the estimate is discarded as unreliable whenever the same quantity exceeds
the threshold.

Summarizing our second detection/estimation scheme: We first use the Neyman-Pearson test (1) to
decide between $H_0$, $H_1$. Whenever we decide in favor of $H_1$ we compute the estimate $\hat{\theta}_0(X)$ from (3)
and its corresponding quality index $\mathcal{C}_0(X)$ from (5); then we use the test in (17) to characterize the
estimate as reliable/unreliable.
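The two-step scheme can be sketched as follows. The calibration of the second threshold from simulated pairs of likelihood ratio and quality index under H1 is an assumption of this example (a Monte Carlo stand-in for satisfying the reliability constraint with equality); the function names and decision labels are hypothetical.

```python
import math


def calibrate_lambda(samples_h1, gamma_np, beta):
    """Approximate the second threshold so that a fraction 1 - beta of the
    simulated H1 samples is both detected and labeled reliable.

    samples_h1: list of (likelihood ratio, quality index) pairs under H1.
    """
    detected = sorted(c0 for lr, c0 in samples_h1 if lr >= gamma_np)
    needed = max(1, math.ceil((1 - beta) * len(samples_h1)))
    if needed > len(detected):
        raise ValueError("beta is below the miss probability of the first test")
    return detected[needed - 1]


def two_step(lik_ratio, c0, theta0, gamma_np, lam):
    """First step: likelihood ratio test. Second step: reliability label."""
    if lik_ratio < gamma_np:
        return "H0", None
    label = "H1r" if c0 <= lam else "H1u"
    return label, theta0  # the estimate is returned with its label


# Hypothetical simulated (likelihood ratio, quality index) pairs under H1.
samples = [(2.0, 0.1), (3.0, 0.4), (0.5, 0.2), (4.0, 0.3)]
lam = calibrate_lambda(samples, gamma_np=1.0, beta=0.5)
decisions = [two_step(lr, c0, 7, 1.0, lam)[0] for lr, c0 in samples]
```

In this toy example three of the four samples are detected and two of them are labeled reliable, so the fraction of reliable decisions equals 1 - β = 0.5 while the detection power of the first step is left untouched.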

C. MSE Cost and Uniform Prior 

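If we call $\mathcal{L}(X|\theta) = \frac{f_1(X|\theta)}{f_0(X)}$ the conditional likelihood ratio, then all quantities entering in the two tests
can be expressed with the help of $\mathcal{L}(X|\theta)$ and the prior probability $\pi(\theta)$. We start with the likelihood
ratio which is part of both tests and observe that we can write it as

$$\mathcal{L}(X) = \frac{f_1(X)}{f_0(X)} = \int \mathcal{L}(X|\theta)\pi(\theta)\,d\theta. \tag{18}$$

From (4) we can see that the posterior cost $\mathcal{C}(U|X)$ can be computed as

$$\mathcal{C}(U|X) = \frac{\int C(U,\theta)\, \mathcal{L}(X|\theta)\pi(\theta)\,d\theta}{\int \mathcal{L}(X|\theta)\pi(\theta)\,d\theta}, \tag{19}$$

suggesting that the Bayes estimator $\hat{\theta}_0(X) = \arg\inf_U \mathcal{C}(U|X)$ and the corresponding optimum posterior
cost $\mathcal{C}_0(X) = \inf_U \mathcal{C}(U|X)$ can be expressed with the help of the conditional likelihood ratio as well.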

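Let us now examine the special case where for the cost function we adopt the squared error $C(U, \theta) =
\|U - \theta\|^2$, which leads to the MSE criterion. From [7, Page 143], we know that the optimum estimator
$\hat{\theta}_0(X)$ is the conditional mean $E_1[\theta|X]$. If we also assume the prior $\pi(\theta)$ to be uniform over some known
set $\Omega$ with finite Lebesgue measure $\mu(\Omega)$ then

$$\begin{aligned}
\mathcal{L}(X) &= \frac{f_1(X)}{f_0(X)} = \frac{1}{\mu(\Omega)} \int_\Omega \mathcal{L}(X|\theta)\,d\theta \\
\hat{\theta}_0(X) &= \frac{\int_\Omega \theta\, \mathcal{L}(X|\theta)\,d\theta}{\int_\Omega \mathcal{L}(X|\theta)\,d\theta} \\
\mathcal{C}_0(X) &= \frac{\int_\Omega \|\hat{\theta}_0(X) - \theta\|^2 \mathcal{L}(X|\theta)\,d\theta}{\int_\Omega \mathcal{L}(X|\theta)\,d\theta} = \frac{\int_\Omega \|\theta\|^2 \mathcal{L}(X|\theta)\,d\theta}{\int_\Omega \mathcal{L}(X|\theta)\,d\theta} - \|\hat{\theta}_0(X)\|^2.
\end{aligned} \tag{20}$$

We can see that $\mu(\Omega)$ does not enter in the computation of the estimate $\hat{\theta}_0(X)$ and its quality index
$\mathcal{C}_0(X)$. Although $\mu(\Omega)$ does appear in the likelihood ratio $\mathcal{L}(X)$, it is easy to verify that, in both tests,
it can be transferred to the right hand side and absorbed by the corresponding threshold. We therefore
conclude that no explicit knowledge of this quantity is necessary. Finally, we note that in the MSE
criterion, $\mathcal{C}_0(X)$ is the conditional variance of $\theta$ given $X$, that is, the conditional mean squared error of
$\hat{\theta}_0(X)$, which clearly constitutes a very reasonable quality index for the corresponding estimate.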
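For a scalar parameter, the estimate and its quality index under the MSE cost and a uniform prior can be approximated by replacing the integrals with sums over an equally spaced grid. The sketch below makes that assumption (the grid, the function name and the numerical values are hypothetical); the grid spacing and the measure of the set cancel in both ratios, in line with the observation above.

```python
def mse_estimate_uniform(thetas, cond_lr):
    """Conditional mean and conditional variance on an equally spaced grid.

    thetas[k]: k-th grid point of the parameter set,
    cond_lr[k]: conditional likelihood ratio evaluated at thetas[k].
    """
    total = sum(cond_lr)
    theta0 = sum(t * l for t, l in zip(thetas, cond_lr)) / total
    second_moment = sum(t * t * l for t, l in zip(thetas, cond_lr)) / total
    return theta0, second_moment - theta0 ** 2


theta0, c0 = mse_estimate_uniform([0.0, 1.0, 2.0], [1.0, 2.0, 1.0])
# Rescaling the conditional likelihood ratio leaves both quantities unchanged.
theta0_scaled, c0_scaled = mse_estimate_uniform([0.0, 1.0, 2.0], [4.0, 8.0, 4.0])
```

The second call illustrates that any constant factor multiplying the conditional likelihood ratio, such as the one contributed by a uniform prior, has no effect on the estimate or on its quality index.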

We have now completed the development of our theory that addresses the joint detection and estimation
problem. To demonstrate the power and originality of our analysis, we first apply our results to the example
of retrospective change detection and then, in Section III and to a much greater extent, we use them to solve
the MIMO radar problem.



D. Example: Retrospective Change Detection 

Retrospective change detection is the problem where within a given set of data $X = [x_1, \dots, x_N]$ there
is a possible time instant $\tau$ where the data switch statistics from some nominal pdf $f(X)$ before $\tau$ to an
alternative pdf $h(X)$ after $\tau$. We consider $\tau$ as the last time instant under the nominal regime. Given $X$
we are interested in detecting the change but also estimating the time $\tau$ the change took place.

We should point out that retrospective change detection methodology is largely dominated by sequential
techniques [3]. However, this constitutes a serious misuse of these methods since, in the retrospective
formulation, the data are all available at once, whereas in the sequential setup the data become available
sequentially. This means that adopting sequential tests for the solution of the retrospective problem
results in an inefficient utilization of the existing information.

Let us now apply our previous theory. Note that for $0 \leq \tau \leq N$, the two pdfs can be decomposed as

$$\begin{aligned}
f(X) &= f(x_1, \dots, x_\tau) \times f(x_{\tau+1}, \dots, x_N | x_1, \dots, x_\tau) \\
h(X) &= h(x_1, \dots, x_\tau) \times h(x_{\tau+1}, \dots, x_N | x_1, \dots, x_\tau).
\end{aligned} \tag{21}$$

We first need to define the data pdf under the two hypotheses. Under $H_0$ we are under the nominal
model therefore, clearly, $f_0(X) = f(X)$. Under $H_1$ and with a change occurring at $\tau$, we define the pdf
$f_1(X|\tau)$ as follows

$$f_1(X|\tau) = f(x_1, \dots, x_\tau) \times h(x_{\tau+1}, \dots, x_N | x_1, \dots, x_\tau). \tag{22}$$

In other words, from the decompositions in (21), we combine the first part of the nominal pdf with the
second part of the alternative. With this changepoint model, the data before the change affect the data
after the change through the conditional pdf. This is the most common model used in change detection
theory [8]. Note that $\tau > N - 1$ means that all the data are under the nominal regime (i.e. there is no
change) whereas $\tau = 0$ means that all the data are under the alternative regime. Therefore, under $H_1$ we have
$\tau \in \{0, \dots, N-1\}$ with some prior $\{\pi_0, \dots, \pi_{N-1}\}$.
Let us compute the quantities that are necessary to apply our tests. Using (22) we can write for the
conditional likelihood ratio

$$\mathcal{L}(X|\tau) = \frac{h(x_{\tau+1}, \dots, x_N | x_1, \dots, x_\tau)}{f(x_{\tau+1}, \dots, x_N | x_1, \dots, x_\tau)}, \tag{23}$$

suggesting that the likelihood ratio, from (18), takes the form $\mathcal{L}(X) = \sum_{\tau=0}^{N-1} \pi_\tau \mathcal{L}(X|\tau)$.
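As an illustration, suppose the samples are independent, distributed as N(0,1) before the change and as N(mu,1) after it. This specific model is an assumption of the example and is not taken from the text. The conditional pdfs then factor into products, and the logarithm of the conditional likelihood ratio for every candidate change time can be obtained with a single backward pass over the data.

```python
import math


def log_cond_lr(x, mu):
    """log of the conditional likelihood ratio for tau = 0, ..., N-1,
    assuming i.i.d. samples, N(0,1) before the change and N(mu,1) after."""
    n = len(x)
    out = [0.0] * n
    running = 0.0
    for tau in range(n - 1, -1, -1):
        # x[tau] (0-based) is the first sample after a change at time tau.
        running += mu * x[tau] - 0.5 * mu * mu
        out[tau] = running
    return out


def likelihood_ratio(log_lr, prior):
    """Prior-weighted sum of the conditional likelihood ratios."""
    return sum(p * math.exp(l) for p, l in zip(prior, log_lr))


logs = log_cond_lr([0.0, 2.0], mu=1.0)
```

For the two samples above, a change at time 1 involves only the last sample (log-ratio 1.5), while a change at time 0 involves both samples (log-ratio 1.0).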

Consider now the estimation problem. We propose the following cost function $C(U,\tau) = 1_{\{U \neq \tau\}}$,
penalizing incorrect estimates by a unit cost. The average cost is clearly the probability to estimate
incorrectly. Observing that $1_{\{U \neq \tau\}} = 1 - 1_{\{U = \tau\}}$, from (19) we can write

$$\mathcal{C}(U|X) = 1 - \frac{\mathcal{L}(X|U)\pi_U}{\sum_{\tau=0}^{N-1} \mathcal{L}(X|\tau)\pi_\tau} = 1 - \frac{\mathcal{L}(X|U)\pi_U}{\mathcal{L}(X)}. \tag{24}$$

Consequently the optimum estimator that minimizes $\mathcal{C}(U|X)$ over $U \in \{0, \dots, N-1\}$ is

$$\hat{\tau}_0(X) = \arg\max_{0 \leq U \leq N-1} \mathcal{L}(X|U)\pi_U, \tag{25}$$

which is the MAP estimator [7, Pages 145-150]; while the corresponding optimum posterior cost becomes

$$\mathcal{C}_0(X) = 1 - \frac{\max_{0 \leq U \leq N-1} \mathcal{L}(X|U)\pi_U}{\mathcal{L}(X)}. \tag{26}$$

The classical test that treats the two subproblems separately consists in comparing the likelihood ratio
$\mathcal{L}(X)$ to the threshold $\gamma_{NP}$ in order to distinguish between the two hypotheses and using $\hat{\tau}_0(X)$ to estimate
the time of change. GLRT, on the other hand, compares $\max_{0 \leq U \leq N-1} \mathcal{L}(X|U)$ to a threshold with the
argument of this maximization providing the estimate for the time of change.

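Applying our theory to this problem, for the single-step test we use $\hat{\tau}_0(X)$ for the estimate of the
changetime and either

$$\frac{\max_{0 \leq U \leq N-1} \mathcal{L}(X|U)\pi_U}{\mathcal{L}(X)} \underset{H_0}{\overset{H_1}{\gtrless}} 1 - \lambda_0 \tag{27}$$

or

$$(\lambda - 1)\mathcal{L}(X) + \max_{0 \leq U \leq N-1} \mathcal{L}(X|U)\pi_U \underset{H_0}{\overset{H_1}{\gtrless}} \gamma \tag{28}$$

for the decision. For the two-step scheme we compare the likelihood ratio $\mathcal{L}(X)$ to the threshold $\gamma_{NP}$ to
decide between the two hypotheses; use $\hat{\tau}_0(X)$ for the changepoint estimate and finally apply

$$\frac{\max_{0 \leq U \leq N-1} \mathcal{L}(X|U)\pi_U}{\mathcal{L}(X)} \underset{H_{1u}}{\overset{H_{1r}}{\gtrless}} 1 - \lambda \tag{29}$$

to label the estimate as reliable/unreliable. Both combined schemes resulting from our theory are
completely original and make efficient use of all available information.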
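The quantities entering the changepoint tests can be sketched directly from the conditional likelihood ratios and the prior. The function names, the decision labels and the numerical values below are assumptions of this example, and the thresholds are assumed to have been calibrated beforehand.

```python
def changepoint_statistics(cond_lr, prior):
    """Return (likelihood ratio, MAP change time, optimum posterior cost).

    cond_lr[tau]: conditional likelihood ratio for a change at time tau,
    prior[tau]: prior probability of tau, for tau = 0, ..., N-1.
    """
    weighted = [l * p for l, p in zip(cond_lr, prior)]
    lik_ratio = sum(weighted)
    tau0 = max(range(len(weighted)), key=weighted.__getitem__)
    c0 = 1.0 - weighted[tau0] / lik_ratio
    return lik_ratio, tau0, c0


def retrospective_two_step(cond_lr, prior, gamma_np, lam):
    """Likelihood ratio test followed by the reliability label."""
    lik_ratio, tau0, c0 = changepoint_statistics(cond_lr, prior)
    if lik_ratio < gamma_np:
        return "H0", None
    return ("H1r" if c0 <= lam else "H1u"), tau0


cond_lr = [1.0, 8.0, 1.0, 2.0]  # hypothetical values for N = 4
prior = [0.25] * 4
lik_ratio, tau0, c0 = changepoint_statistics(cond_lr, prior)
```

In this example the likelihood ratio equals 3, the MAP change time is 1 and the optimum posterior cost is 1/3, so the estimate is labeled reliable for a threshold of 0.5 on the quality index and unreliable for a threshold of 0.2.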

III. Application to MIMO Radar

A context where performing joint detection and estimation is of particular interest is radar systems.
Radars are often deployed not only to detect a target but also to estimate unknown parameters associated with
the target, e.g., position and velocity. Recent developments equip radars with multiple
transmit and receive arrays that considerably improve their detection power and estimation accuracy
compared with conventional phased-array radars.

In this section we examine the merits of the tests developed in the previous section for enhancing
the detection and estimation quality by employing multiple-input multiple-output (MIMO) radar systems
with widely separated antennas [9]. In particular we are interested in the detection of a target, and the
estimation of its location every time a target is ruled present. This is somewhat different from the more
conventional approaches in MIMO radar systems, e.g., [10] and references therein, where the probe
space is broken into small subspaces and the radar detects the presence of the target in each of the
subspaces separately. In this approach, as the location to be probed is given, one is only testing whether
a target is present in a certain given subspace [10]. This necessitates implementing multiple detection
tests in parallel, one for each subspace. In this section, we develop detectors and estimators based on the
optimality theory discussed in the previous section, which are used only once for the entire space.

A. System Description 

We consider a MIMO radar system with $M$ transmit and $N$ receive antennas that are widely separated
(satisfy the conditions in [10, Sec. II.A]). Such spacing among the antennas ensures that the receivers
capture uncorrelated reflections from the target. The transmit and receive antennas are located at positions
$\theta_m^t$, for $m \in \{1, \dots, M\}$, and $\theta_n^r$, for $n \in \{1, \dots, N\}$, respectively, known at the receiver.

The $m$th transmit antenna emits the waveform with baseband equivalent model given by $\sqrt{E}\,s_m(t)$,
where $E$ is the transmitted energy of a single transmit antenna (assumed to be the same for all
transmitters); $\int_0^{T_s} |s_m(t)|^2\,dt = 1$ and $T_s$ denotes the common duration of all signals $s_m(t)$.

We aim to detect the presence of an extended target and, when it is deemed to be present, also to estimate its
position. The extended target consists of multiple scatterers exhibiting random, independent and isotropic
scintillation, each modeled with a complex random variable of zero mean and unknown distribution.
This corresponds to the classical Swerling case I model [11] extended for multiple-antenna systems [9],
[10]. The reflectivity factors are assumed to remain constant during a scan and are allowed to change
independently from one scan to another.
We define $\theta$ as the location of the gravity center of the target and $d_{mn}(\theta)$ as the aggregate distance 
that a probing waveform $s_m(t)$ travels from the $m$th transmit antenna to the target and from the target 
to the $n$th receive antenna, i.e., 

$$d_{mn}(\theta) = \|\theta - \theta_m^t\| + \|\theta - \theta_n^r\|. \tag{30}$$

The time delay that the waveform $s_m(t)$ experiences by traveling this distance $d_{mn}(\theta)$ is equal to 

$$\tau_{mn}(\theta) = \frac{d_{mn}(\theta)}{c}, \tag{31}$$
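The two definitions above can be illustrated with a short numerical sketch. It assumes the aggregate distance in (30) is the sum of the two Euclidean path lengths, uses two-dimensional antenna positions purely for illustration, and the function names `aggregate_distance` and `delays` are hypothetical.

```python
import numpy as np

C_KM_PER_S = 299_792.458  # speed of light in km/s

def aggregate_distance(theta, tx, rx):
    """d_mn(theta): path length transmit antenna m -> target -> receive antenna n."""
    theta = np.asarray(theta, dtype=float)
    d_tx = np.linalg.norm(np.asarray(tx, dtype=float) - theta, axis=1)  # shape (M,)
    d_rx = np.linalg.norm(np.asarray(rx, dtype=float) - theta, axis=1)  # shape (N,)
    return d_tx[:, None] + d_rx[None, :]                                # shape (M, N)

def delays(theta, tx, rx, c=C_KM_PER_S):
    """tau_mn(theta) = d_mn(theta) / c, in seconds when positions are in km."""
    return aggregate_distance(theta, tx, rx) / c

# Illustrative geometry (positions in km), not taken from the paper's figures.
tx = [[1.0, 0.0], [2.0, 0.0]]
rx = [[0.0, 1.0], [0.0, 2.0]]
d = aggregate_distance([3.0, 4.0], tx, rx)
tau = delays([3.0, 4.0], tx, rx)
```

Each entry of `tau` is the delay that shifts the corresponding waveform in the received-signal model.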

where $c$ is the speed of light. When the target dimensions are considerably smaller than the distance of 
the target from the transmit and receive antennas, the distance of the antennas to each scatterer of the 
target can be well approximated by their distances from the gravity center of the target. Therefore, the 
received signal at the $n$th receive antenna is the superposition of all emitted waveforms and is given by 

$$r_n(t) = \sqrt{E}\sum_{m=1}^{M} \frac{g_{mn}}{d_{mn}^{\eta}(\theta)}\, s_m\big(t - \tau_{mn}(\theta)\big) + w_n(t), \tag{32}$$

where $d_{mn}^{-\eta}(\theta)$ is the path-loss with $\eta$ denoting the path-loss exponent; $w_n(t)$ the additive white Gaussian 
complex-valued noise distributed as $\mathcal{N}_c(0,1)$¹; and $g_{mn}$ accounts for the reflectivity effects of the 
scatterers corresponding to the $m$th transmit and the $n$th receive antennas. It can be readily verified 
that $\{g_{mn}\}$ are independent and identically distributed (i.i.d.) with distribution $\mathcal{N}_c(0,1)$ [10], [12]. We 
note that we have assumed that the noises $w_n(t)$ and the coefficients $g_{mn}$ have variance equal 
to 1. In fact, if we use any other values, e.g. $\sigma_w^2$ and $\sigma_g^2$ respectively, then in the final test these quantities 
are combined with the transmitted signal power $E$ in the form $E\sigma_g^2/\sigma_w^2$. Consequently, provided that 
in the general case $\sigma_w^2$ and $\sigma_g^2$ are known, we may assume without loss of generality that $\sigma_w^2 = \sigma_g^2 = 1$ 
and let $E$ express the final combination. 
For $n \in \{1,\ldots,N\}$ define 

$$G_n = [g_{1n}, \ldots, g_{Mn}]^H, \qquad s_n(t,\theta) = \sqrt{E}\left[\frac{s_1(t-\tau_{1n}(\theta))}{d_{1n}^{\eta}(\theta)}, \ldots, \frac{s_M(t-\tau_{Mn}(\theta))}{d_{Mn}^{\eta}(\theta)}\right]', \tag{33}$$

where we recall that $d_{mn}(\theta)$ and $\tau_{mn}(\theta)$ are known functions of $\theta$ defined in (30), (31), and $A'$, $A^H$ 
denote the transpose and Hermitian (transpose and complex conjugate), respectively, of the matrix $A$. 
Under these definitions we can write 

$$r_n(t) = G_n^H s_n(t,\theta) + w_n(t). \tag{34}$$

¹ $\mathcal{N}_c(\mu,\sigma^2)$ denotes the distribution of a complex Gaussian random variable with mean $\mu = \mu_r + j\mu_i$ where the real and 
imaginary parts are uncorrelated (and therefore independent) Gaussian random variables with mean $\mu_r$, $\mu_i$ respectively and of 
variance equal to $\sigma^2/2$. 

Let us now formulate the joint detection and estimation problem for the specific signal model we just 
introduced. 

B. Target Detection/Localization with MIMO Radar 

For $0 \le t \le T$, we distinguish the following two hypotheses satisfied by the received signals $r_n(t)$, $n = 1,\ldots,N$, 

$$H_0:\ dr_n(t) = dw_n(t)$$
$$H_1:\ dr_n(t) = G_n^H s_n(t,\theta)\,dt + dw_n(t).$$

We have written the received signals in a stochastic differential equation form, since the $\{w_n(t)\}$ are 
Wiener (white Gaussian noise) processes. As we can see, when there is no target present the measured 
signals are pure Wiener processes, whereas with the appearance of a target we have the emergence of 
the nonzero drifts $G_n^H s_n(t,\theta)$. 

For simplicity, let us use $\bar r_n$ to denote the signal acquired by the $n$th receive antenna during the time 
interval $[0,T]$, that is, $\bar r_n = \{r_n(t),\ 0 \le t \le T\}$. The collection of these $N$ signals constitutes the 
complete set of observations; in other words, $\{\bar r_1,\ldots,\bar r_N\}$ plays the role of the observation signal $X$ of 
the previous section. Clearly, our goal is to use $\{\bar r_1,\ldots,\bar r_N\}$ in order to decide between the presence or 
absence of a target and, every time a target is detected, to provide a reliable estimate of its position. 

To apply the theory developed in the previous section, according to Section II-C we need to find the 
conditional likelihood ratio $\mathcal{L}(\bar r_1,\ldots,\bar r_N|\theta)$. The following theorem provides the required formula. 
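Theorem 2: The likelihood ratio $\mathcal{L}(\bar r_1,\ldots,\bar r_N|\theta)$ of the received signals is given by 

$$\mathcal{L}(\bar r_1,\ldots,\bar r_N|\theta) = \prod_{n=1}^{N} \frac{e^{R_n^H(\theta)\,(Q_n(\theta)+I_M)^{-1}R_n(\theta)}}{|Q_n(\theta)+I_M|}, \tag{35}$$

where 

$$Q_n(\theta) = \int_0^T s_n(t,\theta)\,s_n^H(t,\theta)\,dt, \qquad R_n(\theta) = \int_0^T s_n(t,\theta)\,dr_n^*(t), \tag{36}$$

$I_K$ denotes the identity matrix of size $K$ and $|A|$ the determinant of the matrix $A$. 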
Proof: The proof is presented in the Appendix. ■ 
A final quantity that is of major interest for the next section is the appropriate definition of SNR. Note 
that, depending on the position of the target, the received signals $r_n(t)$ exhibit different SNR levels. This 
is due to the path-loss effect, which is particularly severe for distant targets. We therefore propose to 
measure the SNR by aggregating the signal and noise energies at the receivers but also averaging these 
quantities over all possible target positions $\theta \in \Omega$. Specifically, by adopting the uniform model for $\theta$, we 
define 



where from standard Itô calculus the expectation in the denominator is equal to $T$. For the approximate 
equality we overlooked the boundary effects in the numerator, that is, we assumed that $\int_0^T |s_m(t - 
\tau_{mn}(\theta))|^2\,dt = 1$ for all $\tau_{mn}(\theta)$ which, of course, is not true when $\theta$ is close to the boundary of $\Omega$. If 
there is no path-loss, that is $\eta = 0$, then the previous equation reduces to the simple formula $\mathrm{SNR} \approx EM/T$. 
The transmitted energy $E$ will be tuned through these equations in order to deliver the appropriate SNR 
level at the receivers. 

We have now developed all necessary formulas that enable us to use the results of Section II in the 
MIMO radar problem. In the next subsection we evaluate the joint detection/estimation scheme with 
Monte-Carlo simulations that cover various combinations of SNR values and number of transmit/receive 



antennas. We apply only the two-step test developed in Section II-B since, as we briefly argued earlier, 
it is better suited to the MIMO radar problem. 



C. Simulations 

We consider the two-dimensional analog of the MIMO radar problem with two configurations consisting 
of $M = N = 2$ and $M = N = 3$ antennas, where the $m$th transmit and the $n$th receive antenna are 
located at $\theta_m^t = [m, 0]'$ and $\theta_n^r = [0, n]'$ (expressed in Km), respectively. 

The emitted waveforms are $s_m(t) = \frac{1}{\sqrt{T_s}}\,e^{j 2\pi m t/T_s}$ for $t \in [0, T_s]$, where $T_s = 10^{-4}$ sec is the signal 
duration. Moreover, we select an integration time $T = 5 \times T_s = 5 \times 10^{-4}$ sec. This integration limit can 
accommodate delays $\tau_{mn}(\theta)$ that do not exceed $T$ (for larger delays we simply measure noise during the 
interval $[0,T]$). The maximal delay defines a region $\Omega$ in space where every point $\theta \in \Omega$ has at least 
one aggregate distance $d_{mn}(\theta)$, defined in (30), from one transmit and one receive antenna that does not 
exceed the value $c \times T \approx 150$ Km. Actually, the points in space that have an aggregate distance from a 
pair of transmit/receive antennas not exceeding 150 Km lie in the interior of a well-defined ellipse. Since 
we have $M \times N$ pairs of transmit/receive antennas, we conclude that $\Omega$ is the union of an equal number 
of such ellipses. By considering that all antennas are roughly positioned at the origin, all ellipses become 
circles and $\Omega$ can be approximated by a disc of approximate radius 75 Km. 

As is the usual practice in the MIMO radar literature, we assume $\eta = 0$, namely, no path-loss. This means 
that we are going to tune our energy parameter $E$ through the simplified equation $\mathrm{SNR} \approx EM/T$. We 
consider SNR values $-20$, $-10$, $0$ and $10$ dB. 
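As a rough sketch of this tuning step, the snippet below inverts the simplified relation under the assumption that it reads SNR ≈ EM/T, which is what aggregating unit-variance reflectivities and unit-energy waveforms over the receivers gives when there is no path-loss. The helper name `energy_for_snr` is hypothetical.

```python
def energy_for_snr(snr_db, M, T):
    """Transmit energy E such that E * M / T equals the requested SNR (eta = 0)."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return snr_linear * T / M

T = 5e-4  # integration time in seconds, as chosen in the text
for snr_db in (-20, -10, 0, 10):
    for M in (2, 3):
        print(f"SNR = {snr_db:>4} dB, M = {M}: E = {energy_for_snr(snr_db, M, T):.3e}")
```

If the intended SNR definition differs by a constant factor, only the scaling inside `energy_for_snr` would change.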

Assuming that the target position $\theta$ is uniformly distributed within $\Omega$ and that for the cost function 
we employ the MSE criterion, we can use the formulas in (20) for the joint detection/estimation scheme. 



From (20) and (36) we observe the need for space and time integration. Both integrals will be evaluated 
numerically. For time integration we use canonical sampling and consider $L_t$ points $\{t_k\}$ within the 
time interval $[0,T]$. For integration in space we form a canonical square grid of points for $\theta$. Denote 
with $L_\theta$ the number of points $\{\theta_l\}$ that lie in the interior of the region $\Omega$. The two integrals are then 
approximated by sums. Specifically, the quantities in (36), for $\theta = \theta_l$, are approximated by 

$$Q_n(\theta_l) \approx \frac{T}{L_t}\sum_{k=1}^{L_t} s_n(t_k,\theta_l)\,s_n^H(t_k,\theta_l), \tag{38}$$

and $R_n(\theta_l)$ under $H_0$ (needed to compute the threshold $\gamma_{NP}$) takes the form 

$$R_n(\theta_l) \approx \sum_{k=1}^{L_t} s_n(t_k,\theta_l)\,\Delta w_n^*(t_k), \tag{39}$$

while for the same quantity under $H_1$ we can write 

$$R_n(\theta_l) \approx \sum_{k=1}^{L_t} s_n(t_k,\theta_l)\left\{ s_n^H(t_k,\theta_0)\,G_n \times \frac{T}{L_t} + \Delta w_n^*(t_k) \right\}. \tag{40}$$

Parameter $\theta_0$ denotes the "true" target position selected uniformly within $\Omega$ and $\theta_l$ is one of the $L_\theta$ 
grid points in the interior of the same set. The coefficients $G_n$ are selected randomly from a Gaussian 
$\mathcal{N}_c(0, I_M)$ while each $\Delta w_n(t_k)$ is also Gaussian $\mathcal{N}_c(0, T/L_t)$. For each run, the quantities $G_n$, $\theta_0$ and 
$\Delta w_n(t_k)$ are the same for all $\theta_l$. For our simulations we use $L_t = 500$ time samples $\{t_k\}$ and a grid 
with cells 10 Km × 10 Km that generates 179 points $\{\theta_l\}$ in the interior of $\Omega$. 
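The discretization just described can be sketched as follows. This is only an illustration under stated assumptions: the waveform is taken to be a unit-energy complex exponential on $[0, T_s]$, the time samples are midpoints of equal cells, $R_n$ is taken as the integral of $s_n$ against the conjugated observation increments, and the names `waveform`, `s_vec`, `cgauss`, `Q_hat` and `R_hat` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 2, 2                      # transmit / receive antennas
Ts, T, Lt = 1e-4, 5e-4, 500      # waveform duration, integration time, time samples
dt = T / Lt
c = 3.0e5                        # speed of light in km/s (rounded)
E = 1.0 * T / M                  # SNR = 0 dB, assuming SNR ~ E*M/T
tx = np.array([[m, 0.0] for m in range(1, M + 1)])   # transmit positions (km)
rx = np.array([[0.0, n] for n in range(1, N + 1)])   # receive positions (km)
t = (np.arange(Lt) + 0.5) * dt   # midpoint samples t_k (an assumption)

def waveform(m, t):
    """Assumed unit-energy waveform supported on [0, Ts]; zero elsewhere."""
    inside = (t >= 0) & (t <= Ts)
    return inside * np.exp(2j * np.pi * m * t / Ts) / np.sqrt(Ts)

def s_vec(n, theta):
    """Array of shape (Lt, M) whose k-th row is s_n(t_k, theta)' (no path-loss)."""
    d = np.linalg.norm(tx - theta, axis=1) + np.linalg.norm(rx[n] - theta)
    cols = [waveform(m + 1, t - d[m] / c) for m in range(M)]
    return np.sqrt(E) * np.stack(cols, axis=1)

def cgauss(size, var=1.0):
    """Circular complex Gaussian samples with total variance var."""
    return np.sqrt(var / 2) * (rng.standard_normal(size)
                               + 1j * rng.standard_normal(size))

def Q_hat(n, theta_l):
    """Riemann-sum approximation of Q_n(theta_l)."""
    S = s_vec(n, theta_l)
    return dt * (S.T @ S.conj())

def R_hat(n, theta_l, dw, G=None, theta0=None):
    """R_n(theta_l): noise only when G is None (H0), drift plus noise otherwise (H1)."""
    incr = np.conj(dw)
    if G is not None:
        incr = incr + (s_vec(n, theta0).conj() @ G) * dt
    return s_vec(n, theta_l).T @ incr

theta0 = np.array([30.0, 40.0])  # illustrative true position (km)
G = cgauss(M)                    # reflectivities for receiver n = 0
dw = cgauss(Lt, var=dt)          # noise increments for receiver n = 0
Q = Q_hat(0, theta0)
R0 = R_hat(0, theta0, dw)
R1 = R_hat(0, theta0, dw, G=G, theta0=theta0)
```

Reusing the same `G`, `theta0` and `dw` for every grid point, as the text prescribes, keeps the likelihood values at different grid points coupled through one realization of the noise.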



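For the test of Section II-B, according to (20), the likelihood ratio test is implemented as 

$$\frac{1}{L_\theta}\sum_{l=1}^{L_\theta}\prod_{n=1}^{N}\frac{e^{R_n^H(\theta_l)(Q_n(\theta_l)+I_M)^{-1}R_n(\theta_l)}}{|Q_n(\theta_l)+I_M|}\ \underset{H_0}{\overset{H_1}{\gtrless}}\ \gamma_{NP}. \tag{41}$$

Every time a decision is made in favor of $H_1$ we provide the following estimate of $\theta_0$ 

$$\hat\theta_0 = \frac{\sum_{l=1}^{L_\theta}\theta_l\prod_{n=1}^{N}\frac{e^{R_n^H(\theta_l)(Q_n(\theta_l)+I_M)^{-1}R_n(\theta_l)}}{|Q_n(\theta_l)+I_M|}}{\sum_{l=1}^{L_\theta}\prod_{n=1}^{N}\frac{e^{R_n^H(\theta_l)(Q_n(\theta_l)+I_M)^{-1}R_n(\theta_l)}}{|Q_n(\theta_l)+I_M|}} \tag{42}$$

with corresponding quality index 

$$\mathcal{C}_0 = \frac{\sum_{l=1}^{L_\theta}\|\theta_l\|^2\prod_{n=1}^{N}\frac{e^{R_n^H(\theta_l)(Q_n(\theta_l)+I_M)^{-1}R_n(\theta_l)}}{|Q_n(\theta_l)+I_M|}}{\sum_{l=1}^{L_\theta}\prod_{n=1}^{N}\frac{e^{R_n^H(\theta_l)(Q_n(\theta_l)+I_M)^{-1}R_n(\theta_l)}}{|Q_n(\theta_l)+I_M|}} - \|\hat\theta_0\|^2. \tag{43}$$

The estimate $\hat\theta_0$ is characterized as reliable/unreliable depending on whether $\mathcal{C}_0$ is below/exceeds the 
threshold $\lambda$. 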
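A minimal sketch of the two-step decision on a grid is given below. It assumes a uniform prior over the grid points and the MSE cost, so that the estimate is the posterior mean and the quality index is the posterior variance, and it works with log-likelihoods for numerical stability. The function names `log_lr` and `two_step_test` and the argument names are hypothetical.

```python
import numpy as np

def log_lr(Q_list, R_list):
    """Log of prod_n exp(R^H (Q + I)^-1 R) / |Q + I| at a single grid point."""
    total = 0.0
    for Q, R in zip(Q_list, R_list):
        A = Q + np.eye(Q.shape[0])
        quad = np.real(np.conj(R) @ np.linalg.solve(A, R))
        _, logdet = np.linalg.slogdet(A)
        total += quad - logdet
    return float(total)

def two_step_test(grid, loglik, gamma_np, lam):
    """Step 1: averaged likelihood ratio vs gamma_np. Step 2: quality index vs lam."""
    grid = np.asarray(grid, dtype=float)        # shape (L_theta, dim)
    loglik = np.asarray(loglik, dtype=float)    # shape (L_theta,)
    shift = loglik.max()
    w = np.exp(loglik - shift)                  # likelihoods up to a common factor
    log_stat = shift + np.log(w.mean())         # log of the averaged likelihood ratio
    detected = log_stat >= np.log(gamma_np)
    p = w / w.sum()                             # posterior weights on the grid
    theta_hat = p @ grid                        # posterior-mean estimate
    c0 = p @ np.sum(grid ** 2, axis=1) - theta_hat @ theta_hat  # posterior variance
    reliable = bool(detected and c0 <= lam)
    return bool(detected), theta_hat, float(c0), reliable

# Toy usage: two grid points with equal likelihood.
det, th, c0, rel = two_step_test([[0.0, 0.0], [2.0, 0.0]], [0.0, 0.0],
                                 gamma_np=0.5, lam=2.0)
```

Raising `lam` labels more detections as reliable, which is the knob that trades detection power for estimation quality.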

We also consider the GLRT where we maximize the likelihood ratio $\mathcal{L}(\bar r_1,\ldots,\bar r_N|\theta)$ in (35) over $\theta$ 
and compare it to a threshold. The threshold is selected so that the corresponding false alarm probability 
is equal to $\alpha$. We recall that GLRT provides ML estimates for $\theta$ and, as we mentioned, cannot trade 
detection power for estimation. 

Monte Carlo simulations were carried out in order to study the performance of the different tests. For 
each SNR value, 200,000 simulations were implemented to validate our theoretical developments. In our 
simulations we fixed the false alarm probability to $\alpha$ = 10^^. The (conditional) MSE was computed as 
$\frac{1}{K}\sum \|\hat\theta_0 - \theta_0\|^2$, where $K$ is the total number of cases where the combined test decided in favor of $H_{1r}$ 
(that is, $H_1$ in the first step and $H_{1r}$ in the second). 

In Fig. 1 we depict the MSE normalized by the squared (approximate) radius of $\Omega$ ($75^2$) as a function 
of the fraction of reliable estimates, that is, the proportion of detections whose estimate is declared reliable. 
The fraction value is controlled through the threshold $\lambda$. A fraction value equal to 1 in our test corresponds 
to the performance of the classical approach where detection and estimation are treated separately. For the 
same value we also present the performance of the GLRT. We observe that for SNR $= -20$ dB we need to 
sacrifice more than 50% of our detections (more accurately, in these cases we regard the estimates as 
unreliable) to reduce the MSE by a factor of 2. For larger SNR values we can have significant (even 
enormous) gains. For example, for SNR $= 0$ dB, by sacrificing 50% of the detections, in the 2×2 case we 
gain an order of magnitude in estimation performance, while the same gain in the 3×3 configuration is 
achieved with only a 25% reduction. We conclude from our simulations that, apart from the very low SNR 
case of $-20$ dB, the 3×3 antenna configuration is preferable to the 2×2 since it can return significant 
performance gains. Finally, we observe that GLRT and the classical approach that treats the two 
subproblems separately have very comparable performance. 

Fig. 1. Normalized MSE as a function of the fraction of reliable estimates for different values of SNR. Configuration $M = 
N = 2$: optimum is solid and GLRT is •; configuration $M = N = 3$: optimum is dashed and GLRT is o. 

IV. Conclusion 

We have presented two possible formulations of the joint detection and estimation problem and developed 
the corresponding optimum solutions. Our approach consists in properly combining the Bayesian 
method for estimation with suitable constraints on the detection part. The resulting optimum schemes 
allow for a trade-off between detection power and estimation efficiency, thus emphasizing each 
subproblem according to the needs of the original application. Our theory was then applied to the problems 
of retrospective change detection and MIMO radar. In particular, in the second application, extensive 
simulations demonstrated that significant gains in estimation quality are possible with small 
sacrifices in detection power. 
V. Appendix 

Proof of Theorem 1: We are interested in minimizing $\mathcal{J}(\delta_0,\delta_1)$ defined in (9) subject to the two 
constraints $\int \delta_1(X)f_0(X)\,dX \le \alpha$ and $\int \delta_0(X)f_1(X)\,dX \le \beta$. We first note that if we have a pair 
$\{\delta_0(X), \delta_1(X)\}$ for which the second inequality is strict, then we can find another pair $\{\tilde\delta_0(X), \tilde\delta_1(X)\}$ 
which satisfies the second constraint with equality and has exactly the same estimation performance. 
Indeed, if we select $\tilde\delta_1(X) = \delta_1(X)\frac{1-\beta}{\int \delta_1(x)f_1(x)\,dx}$ and $\tilde\delta_0(X) = 1 - \tilde\delta_1(X)$, then we observe that since 
we assumed $\int \delta_0(X)f_1(X)\,dX < \beta$ we have $\int \delta_1(X)f_1(X)\,dX = 1 - \int \delta_0(X)f_1(X)\,dX > 1 - \beta$, 
suggesting that $\tilde\delta_1(X)$ is a legitimate probability (because $\delta_1(X)$ is multiplied by a factor smaller than 
1 to produce $\tilde\delta_1(X)$); consequently the complementary probability $\tilde\delta_0(X)$ is legitimate as well. The false 
alarm constraint continues to hold because $\tilde\delta_1(X) \le \delta_1(X)$. The fact 
that the alternative pair has exactly the same estimation performance, namely $\mathcal{J}(\tilde\delta_0, \tilde\delta_1) = \mathcal{J}(\delta_0, \delta_1)$, can 
be verified by direct substitution. 
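The rescaling argument can be checked on a toy discrete example. The sketch assumes that the criterion in (9) is the ratio of $\int \delta_1 \mathcal{C}_0 f_1$ to $\int \delta_1 f_1$, which is what the reference to its numerator suggests; all numbers are made up for illustration.

```python
import numpy as np

f1 = np.array([0.1, 0.2, 0.3, 0.4])      # pmf under H1 on four sample points
C0 = np.array([4.0, 1.0, 2.0, 0.5])      # conditional estimation cost at each point
delta1 = np.array([0.9, 1.0, 0.8, 1.0])  # randomized probabilities of deciding H1
beta = 0.3                               # allowed probability of miss

def J(d1):
    """Assumed form of the criterion: average cost given that H1 is decided."""
    return np.sum(d1 * C0 * f1) / np.sum(d1 * f1)

detection = np.sum(delta1 * f1)          # 0.93, strictly above 1 - beta
delta1_tilde = delta1 * (1 - beta) / detection
```

The rescaled rule meets the miss constraint with equality while leaving the ratio unchanged, because the common factor cancels between numerator and denominator.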

With the previous observation we can limit our search for the optimum within the class of tests 
that satisfy the constraint on the probability of miss with equality, that is, $\int \delta_0(X)f_1(X)\,dX = \beta$. 
Equivalently, we consider only tests that satisfy the equality constraint $\int \delta_1(X)f_1(X)\,dX = 1 - \beta$ on the 
detection probability. Under this equality, minimizing $\mathcal{J}(\delta_0, \delta_1)$ is equivalent to minimizing the numerator 
$\int \delta_1(X)\mathcal{C}_0(X)f_1(X)\,dX$ in (9). 

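Due to the nonnegativity of $\mathcal{C}_0(X)$ and our assumption that $\mathcal{C}_0(X)$ does not contain any atoms we 
have that (10) has a unique solution $\lambda_0 > 0$. Suppose that we are in the case where $\alpha \ge P_0(\mathcal{C}_0(X) \le \lambda_0)$ 
and consider a test $\{\delta_0(X),\delta_1(X)\}$ that satisfies the equality $\int \delta_1(X)f_1(X)\,dX = 1-\beta$. We can then 
write 

$$\begin{aligned}
\int \delta_1(X)\mathcal{C}_0(X)f_1(X)\,dX - \lambda_0(1-\beta) &= \int \delta_1(X)\mathcal{C}_0(X)f_1(X)\,dX - \lambda_0\int \delta_1(X)f_1(X)\,dX \\
&= \int \delta_1(X)[\mathcal{C}_0(X) - \lambda_0]f_1(X)\,dX \\
&\ge \int \mathbb{1}_{\mathcal{A}}[\mathcal{C}_0(X) - \lambda_0]f_1(X)\,dX \\
&= \int \mathbb{1}_{\mathcal{A}}\mathcal{C}_0(X)f_1(X)\,dX - \lambda_0 P_1(\mathcal{A}) \\
&= \int \mathbb{1}_{\mathcal{A}}\mathcal{C}_0(X)f_1(X)\,dX - \lambda_0(1-\beta),
\end{aligned} \tag{44}$$

where $\mathcal{A} = \{\mathcal{C}_0(X) \le \lambda_0\}$. The inequality holds because $[\delta_1(X) - \mathbb{1}_{\mathcal{A}}][\mathcal{C}_0(X) - \lambda_0] \ge 0$ for every $X$. 
Comparing the first and the last term yields $\int \delta_1(X)\mathcal{C}_0(X)f_1(X)\,dX \ge 
\int \mathbb{1}_{\mathcal{A}}\mathcal{C}_0(X)f_1(X)\,dX$, which proves that (11) is the optimum since it minimizes the estimation criterion 
and satisfies both constraints. We observe in this case that, for the optimum test, the false alarm constraint 
can be strict. 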

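Consider now the case $\alpha < P_0(\mathcal{C}_0(X) \le \lambda_0)$ and let us show that there is a pair $\lambda, \gamma$ for which the 
test in (12) satisfies both constraints with equality. We are first going to prove that for any $\lambda \ge \lambda_0$ we 
can find $\gamma(\lambda) \ge 0$ to satisfy the equality constraint for the detection probability, namely 

$$P_1\left(\frac{f_1(X)}{f_0(X)}[\lambda - \mathcal{C}_0(X)] \ge \gamma(\lambda)\right) = 1-\beta. \tag{45}$$

Call $\psi(\lambda,\gamma) = P_1([f_1(X)/f_0(X)][\lambda - \mathcal{C}_0(X)] \ge \gamma) - (1-\beta)$ and fix $\lambda \ge \lambda_0$; then we observe that 
$\psi(\lambda,0) \ge \psi(\lambda_0,0) = 0$. Furthermore $\lim_{\gamma\to\infty}\psi(\lambda,\gamma) = -(1-\beta) < 0$. Consequently there exists $\gamma(\lambda)$ 
such that (45) is true. There are two pairs $\lambda, \gamma(\lambda)$ which we can describe explicitly. From the definition of 
$\lambda_0$ we know that when $\lambda = \lambda_0$ we have $\gamma(\lambda_0) = 0$. Consider now $\lambda \to \infty$ and assume that $\gamma(\lambda)/\lambda \to \bar\gamma$; 
then $\bar\gamma$ is the solution to the equation 

$$P_1\left(\frac{f_1(X)}{f_0(X)} \ge \bar\gamma\right) = 1-\beta. \tag{46}$$

This is true because the test in (12), after dividing each side by $\lambda$ and letting $\lambda \to \infty$, reduces to 
the likelihood ratio test with threshold $\bar\gamma$. Since by assumption we have $\beta \ge \beta(\alpha)$, where $\beta(\alpha)$ is the 
probability of miss of the Neyman-Pearson test, we conclude that $\bar\gamma \ge \gamma_{NP}$. This suggests that 

$$P_0\left(\frac{f_1(X)}{f_0(X)} \ge \bar\gamma\right) \le P_0\left(\frac{f_1(X)}{f_0(X)} \ge \gamma_{NP}\right) = \alpha. \tag{47}$$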



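Now we need to show that there exists a value for $\lambda$ and the corresponding threshold $\gamma(\lambda)$ that satisfy 
the false alarm constraint with equality, namely 

$$P_0\left(\frac{f_1(X)}{f_0(X)}[\lambda - \mathcal{C}_0(X)] \ge \gamma(\lambda)\right) = \alpha. \tag{48}$$

Call $\phi(\lambda) = P_0([f_1(X)/f_0(X)][\lambda - \mathcal{C}_0(X)] \ge \gamma(\lambda)) - \alpha$. Then, because of our previous analysis, it is 
easy to verify that $\phi(\lambda)$ has opposite signs for $\lambda = \lambda_0$ and $\lambda \to \infty$: at $\lambda = \lambda_0$ we have $\gamma(\lambda_0) = 0$, so 
$\phi(\lambda_0) = P_0(\mathcal{C}_0(X) \le \lambda_0) - \alpha > 0$ in the case under consideration, while (47) gives a nonpositive limit 
as $\lambda \to \infty$. This means that there exists a $\lambda \ge \lambda_0$ 
such that $\phi(\lambda) = 0$, or that the false alarm constraint is satisfied with equality. 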

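To show that the test in (12) is optimum, let $\lambda, \gamma(\lambda)$ be the previous pair and consider any test 
$\{\delta_0(X),\delta_1(X)\}$ that satisfies the equality constraint for the detection probability and the inequality 
constraint for the false alarm. Then we can write 

$$\begin{aligned}
\int \delta_1(X)\mathcal{C}_0(X)f_1(X)\,dX - \lambda(1-\beta) + \gamma(\lambda)\alpha &\ge \int \delta_1(X)\big\{[\mathcal{C}_0(X) - \lambda]f_1(X) + \gamma(\lambda)f_0(X)\big\}\,dX \\
&\ge \int \mathbb{1}_{\mathcal{A}}\big\{[\mathcal{C}_0(X) - \lambda]f_1(X) + \gamma(\lambda)f_0(X)\big\}\,dX \\
&= \int \mathbb{1}_{\mathcal{A}}\mathcal{C}_0(X)f_1(X)\,dX - \lambda(1-\beta) + \gamma(\lambda)\alpha,
\end{aligned} \tag{49}$$

where $\mathcal{A} = \left\{\frac{f_1(X)}{f_0(X)}[\lambda - \mathcal{C}_0(X)] \ge \gamma(\lambda)\right\}$. Again, comparing the first and the last term proves optimality 
of the test in (12) and therefore concludes the proof of Theorem 1. ■ 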

Proof of Theorem 2: Due to independence across receivers for the noises $\{w_n(t)\}$ and the reflection 
coefficients $\{g_{mn}\}$ we deduce 

$$\mathcal{L}(\bar r_1,\ldots,\bar r_N|\theta) = \prod_{n=1}^{N}\mathcal{L}(\bar r_n|\theta). \tag{50}$$

It is thus sufficient to show that 

$$\mathcal{L}(\bar r_n|\theta) = \frac{e^{R_n^H(\theta)(Q_n(\theta)+I_M)^{-1}R_n(\theta)}}{|Q_n(\theta)+I_M|}. \tag{51}$$

Since $G_n$ is random, we can first compute $\mathcal{L}(\bar r_n|G_n,\theta)$ by conditioning on the coefficients $G_n$ 
corresponding to the $n$th receiver and then average out $G_n$. For given $G_n$ the received signal $r_n(t)$ under the 
two hypotheses differs only in the drift; consequently we can apply Girsanov's theorem [13, Page 191] to 
compute the corresponding likelihood ratio. We can treat the complex-valued Wiener process $w_n(t)$ as 
a two-dimensional real-valued Wiener process, with the real and imaginary part of the complex process 
constituting the two independent components of the two-dimensional process. Since the corresponding 
variances, by assumption, are equal to 0.5, it is straightforward to show that 

$$\mathcal{L}(\bar r_n|G_n,\theta) = e^{-\int_0^T |G_n^H s_n(t,\theta)|^2dt + 2\mathrm{Re}\left(\int_0^T [s_n^H(t,\theta)G_n]\,dr_n(t)\right)} = e^{-G_n^H Q_n(\theta)G_n + 2\mathrm{Re}\left(R_n^H(\theta)G_n\right)}, \tag{52}$$

where $Q_n(\theta)$, $R_n(\theta)$ are defined in (36). 

In order to compute $\mathcal{L}(\bar r_n|\theta)$ from $\mathcal{L}(\bar r_n|G_n,\theta)$ we need to average out $G_n$. We recall that the real and 
imaginary parts of $G_n$ are Gaussian uncorrelated (and thus independent) vectors, each with mean 0 and 
covariance matrix equal to $0.5I_M$. For notational simplicity we drop in all quantities their dependence on 
$n$ and $\theta$. Let us also define the following decompositions into real and imaginary parts $G = G_r + jG_i$, 
$R = R_r + jR_i$, $Q = Q_r + jQ_i$ and, finally, denote $\mathcal{G} = [G_r', G_i']'$, $\mathcal{R} = [R_r', R_i']'$, $\mathcal{Q} = [Q_r, -Q_i;\ Q_i, Q_r]$; 
then we can write the previous likelihood ratio as follows 

$$\mathcal{L}(\bar r|G,\theta) = e^{-\mathcal{G}'\mathcal{Q}\mathcal{G} + 2\mathcal{R}'\mathcal{G}}, \tag{53}$$

where we used the fact that $Q$, by being Hermitian, satisfies $Q_r' = Q_r$ and $Q_i' = -Q_i$. We can now 
average out $\mathcal{G}$ by recalling that $\mathcal{G} \sim \mathcal{N}(0, 0.5I_{2M})$. By "completing the square" we have 



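$$\begin{aligned}
\mathcal{L}(\bar r|\theta) &= \frac{1}{\pi^M}\int e^{-\mathcal{G}'(\mathcal{Q}+I_{2M})\mathcal{G} + 2\mathcal{R}'\mathcal{G}}\,d\mathcal{G} \\
&= \frac{e^{\mathcal{R}'(\mathcal{Q}+I_{2M})^{-1}\mathcal{R}}}{\sqrt{|\mathcal{Q}+I_{2M}|}}\int \frac{\sqrt{|2(\mathcal{Q}+I_{2M})|}}{(2\pi)^M}\, e^{-(\mathcal{G}-(\mathcal{Q}+I_{2M})^{-1}\mathcal{R})'(\mathcal{Q}+I_{2M})(\mathcal{G}-(\mathcal{Q}+I_{2M})^{-1}\mathcal{R})}\,d\mathcal{G} \\
&= \frac{e^{\mathcal{R}'(\mathcal{Q}+I_{2M})^{-1}\mathcal{R}}}{\sqrt{|\mathcal{Q}+I_{2M}|}},
\end{aligned} \tag{54}$$

where the last integral is equal to 1 since it is the integral of a Gaussian pdf with mean $(\mathcal{Q}+I_{2M})^{-1}\mathcal{R}$ 
and covariance matrix $0.5(\mathcal{Q}+I_{2M})^{-1}$. 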

From the nonnegative definiteness of $Q$ we have $Y^H Q Y \ge 0$ for any complex vector $Y$. Using the 
observation that for any real vector $Z$ it is true that $Z'Q_iZ = 0$, as a result of $Q_i' = -Q_i$, we can show 
that $[Y_r', Y_i']\,\mathcal{Q}\,[Y_r', Y_i']' = Y^H Q Y \ge 0$ where $Y = Y_r + jY_i$. Hence $\mathcal{Q}$ is nonnegative definite as well, 
implying that $\mathcal{Q} + I_{2M}$ is positive definite. 

Define two square matrices $A$, $B$ of size $M \times M$ as the solution to the following two equations: 
$(Q_r + I_M)A - Q_iB = I_M$ and $(Q_r + I_M)B + Q_iA = 0_M$ (there always exists a solution due to the positive 
definiteness of $\mathcal{Q} + I_{2M}$); then by direct computation we can verify that $(\mathcal{Q} + I_{2M})^{-1} = [A, -B;\ B, A]$ 
and $(Q + I_M)^{-1} = (Q_r + I_M + jQ_i)^{-1} = A + jB$. With the help of the previous equalities we have 
$\mathcal{R}'(\mathcal{Q} + I_{2M})^{-1}\mathcal{R} = R^H(Q + I_M)^{-1}R$. This proves the correctness of the exponential term in (51). 
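The block structure of the inverse and the equality of the two quadratic forms can be confirmed numerically. The sketch below draws an arbitrary Hermitian nonnegative definite $Q$ and a complex vector $R$; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 3
S = rng.standard_normal((M, 5)) + 1j * rng.standard_normal((M, 5))
Q = S @ S.conj().T                        # Hermitian, nonnegative definite
R = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Real-valued embeddings used in the proof.
Qcal = np.block([[Q.real, -Q.imag], [Q.imag, Q.real]])
Rcal = np.concatenate([R.real, R.imag])

inv_c = np.linalg.inv(Q + np.eye(M))      # complex inverse, equal to A + jB
A, B = inv_c.real, inv_c.imag
inv_r = np.linalg.inv(Qcal + np.eye(2 * M))

quad_real = Rcal @ inv_r @ Rcal                 # real 2M-dimensional quadratic form
quad_cplx = np.real(R.conj() @ inv_c @ R)       # complex M-dimensional quadratic form
```

Both quadratic forms agree up to rounding, which is the identity that justifies the exponent of the likelihood ratio.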

What is left to show is that $\sqrt{|\mathcal{Q} + I_{2M}|} = |Q + I_M|$. Since $\mathcal{Q} + I_{2M} = [Q_r + I_M, -Q_i;\ Q_i, Q_r + I_M]$, 
if $\rho$ is an eigenvalue of this matrix with corresponding eigenvector $[Y_r', Y_i']'$ then $\rho$ is a double eigenvalue, 
because by direct computation we can verify that $[-Y_i', Y_r']'$ is a second eigenvector (orthogonal to the 
first and thus different) for the same eigenvalue $\rho$. Consequently the $2M$ eigenvalues of $\mathcal{Q} + I_{2M}$ are of 
the form $\rho_1, \rho_1, \ldots, \rho_M, \rho_M$ with $\rho_n > 0$ (because of the positive definiteness of $\mathcal{Q} + I_{2M}$), implying 

$$\sqrt{|\mathcal{Q} + I_{2M}|} = \prod_{n=1}^{M}\rho_n.$$
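The eigenvalue pairing is easy to observe numerically. The following sketch, with illustrative variable names, compares the spectrum of the real $2M \times 2M$ embedding with that of the complex $M \times M$ matrix and also evaluates both sides of the determinant identity that the proof is establishing.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 3
S = rng.standard_normal((M, 5)) + 1j * rng.standard_normal((M, 5))
Q = S @ S.conj().T                                  # Hermitian, nonnegative definite
Qcal = np.block([[Q.real, -Q.imag], [Q.imag, Q.real]])

rho_c = np.linalg.eigvalsh(Q + np.eye(M))           # M eigenvalues, ascending
rho_r = np.linalg.eigvalsh(Qcal + np.eye(2 * M))    # 2M eigenvalues, ascending

det_c = np.linalg.det(Q + np.eye(M)).real           # determinant of the complex matrix
det_r = np.linalg.det(Qcal + np.eye(2 * M))         # determinant of the real embedding
```

Every eigenvalue of the complex matrix shows up twice in the real embedding, so the square root of `det_r` matches `det_c`.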

We can now verify that if $\rho$, $[Y_r', Y_i']'$ is an eigenvalue-eigenvector pair of $\mathcal{Q} + I_{2M}$ then $\rho$, $(Y_r + jY_i)$ is 
an eigenvalue-eigenvector pair of $Q + I_M$. This suggests that $\rho$, $(-Y_i + jY_r)$ must also be an 
eigenvalue-eigenvector pair for the same matrix. However, we observe that $(-Y_i + jY_r) = j(Y_r + jY_i)$, which 
means that the two eigenvectors are collinear and therefore coincide. Consequently, for the complex 
matrix $Q + I_M$ the eigenvalues are $\rho_1, \ldots, \rho_M$, meaning that the corresponding determinant satisfies 
$|Q + I_M| = \prod_{n=1}^{M}\rho_n$. This proves the desired equality for the two determinants, demonstrates the validity 